On the relation between logarithmic series model and other superpopulation models useful for microdata disclosure risk assessment
Authors
Abstract
Fisher’s logarithmic series model (Fisher et al. (1943)) is a classical model in statistical ecology. In this paper we show that it is the key link among three models discussed in Takemura (1997): the Poisson-gamma model (Bethlehem et al. (1990)), the Dirichlet-multinomial model (Takemura (1997)), and the Ewens model (Ewens (1990)). This connection opens up the possibility of applying existing techniques of statistical ecology to the problem of microdata disclosure risk assessment.
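As a rough illustration of the kind of link the abstract refers to, the textbook limit that takes the Poisson-gamma (negative binomial) model to the logarithmic series distribution can be sketched as follows. The symbols \alpha (gamma shape), \theta (a parameter in (0,1)), and F (a cell count) are notation assumed here for illustration; the paper's own notation and derivation may differ.

```latex
% Poisson-gamma mixture: the marginal cell count F is negative binomial.
P(F = k) = \frac{\Gamma(\alpha + k)}{\Gamma(\alpha)\, k!}\,(1-\theta)^{\alpha}\,\theta^{k},
\qquad k = 0, 1, 2, \dots

% Condition on the cell being non-empty (zero truncation).
P(F = k \mid F \ge 1)
  = \frac{\Gamma(\alpha + k)}{\Gamma(\alpha)\, k!}\,
    \frac{(1-\theta)^{\alpha}\,\theta^{k}}{1 - (1-\theta)^{\alpha}},
\qquad k = 1, 2, \dots

% As \alpha \to 0:
%   \Gamma(\alpha + k)/\Gamma(\alpha) = \alpha(\alpha+1)\cdots(\alpha+k-1) \approx \alpha\,(k-1)!,
%   1 - (1-\theta)^{\alpha} \approx -\alpha \log(1-\theta),
%   (1-\theta)^{\alpha} \to 1,
% so the factors of \alpha cancel and the logarithmic series distribution remains.
\lim_{\alpha \to 0} P(F = k \mid F \ge 1)
  = \frac{-1}{\log(1-\theta)}\,\frac{\theta^{k}}{k},
\qquad k = 1, 2, \dots
```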
Similar resources
Assessing Microdata Disclosure Risk Using the Poisson-inverse Gaussian Distribution
An important measure of identification risk associated with the release of microdata or large complex tables is the number or proportion of population units that can be uniquely identified by some set of characterizing attributes which partition the population into subpopulations or cells. Various methods for estimating this quantity based on sample data have been proposed in the literature by ...
Some superpopulation models for estimating the number of population uniques
The number of unique individuals in the population is of great importance in evaluating the disclosure risk of a microdata set. We approach this problem by considering some basic superpopulation models, including the gamma-Poisson model of Bethlehem et al. (1990). We introduce the Dirichlet-multinomial model, which is closely related to but more basic than the gamma-Poisson model, in the sense that ...
Assessing the Protection Provided by Misclassification-based Disclosure Limitation Methods for Survey Microdata
Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect the confidentiality of respondents. There is a need for valid and practical ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are appli...